308 research outputs found

    Probabilistic identification of bacterial essential genes via insertion density using TraDIS data with Tn5 libraries

    Motivation: Probabilistic identification of bacterial essential genes using transposon-directed insertion-site sequencing (TraDIS) data based on Tn5 libraries has received relatively little attention in the literature; most methods are designed for mariner transposon insertions. Analysis of Tn5 transposon-based genomic data is challenging due to the high insertion density and genomic resolution. We present a novel probabilistic Bayesian approach for classifying bacterial essential genes using transposon insertion density derived from transposon insertion sequencing data. We implement a Markov chain Monte Carlo sampling procedure to estimate the posterior probability that any given gene is essential, and a Bayesian decision theory approach to selecting essential genes. We assess the effectiveness of our approach via analysis of both simulated data and three previously published Escherichia coli, Salmonella Typhimurium and Staphylococcus aureus datasets. These three bacteria have relatively well characterized essential genes, which allows us to test our classification procedure using receiver operating characteristic curves and the areas under those curves. We compare the classification performance with that of Bio-Tradis, a standard tool for bacterial gene classification. Results: Our method classifies genes in the three datasets with areas under the curve between 0.967 and 0.983. Our simulated datasets show that the number of insertions and the extent to which insertions are tolerated in the distal regions of essential genes are both important in determining classification accuracy. Importantly, our method gives the user the option of classifying essential genes based on user-supplied costs of false discovery and false non-discovery. Availability and implementation: An R package that implements the method presented in this paper is available for download from https://github.com/Kevin-walters/insdens.
Supplementary information: Supplementary data are available at Bioinformatics online.
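
The cost-based selection step described in this abstract can be illustrated with the textbook two-action Bayesian decision rule. This is a minimal sketch, not the code of the R package: the function name, gene labels and posterior probabilities are hypothetical, and the posterior probabilities are assumed to have been estimated already (for example by MCMC). A gene is called essential when the expected loss of that call, cost_fd * (1 - p), is lower than the expected loss of calling it non-essential, cost_fnd * p.

```python
def classify_essential(posterior_probs, cost_fd, cost_fnd):
    """Apply a two-action Bayesian decision rule to per-gene posteriors.

    posterior_probs: dict mapping gene name -> P(essential | data).
    cost_fd: cost of a false discovery (non-essential gene called essential).
    cost_fnd: cost of a false non-discovery (essential gene missed).
    Both costs are user-supplied and must be positive.
    """
    if cost_fd <= 0 or cost_fnd <= 0:
        raise ValueError("costs must be positive")
    # Calling a gene essential has the lower expected loss exactly when
    # cost_fd * (1 - p) < cost_fnd * p, i.e. p > cost_fd / (cost_fd + cost_fnd).
    threshold = cost_fd / (cost_fd + cost_fnd)
    return {gene: p > threshold for gene, p in posterior_probs.items()}


# Hypothetical posterior probabilities, for illustration only.
example_posteriors = {"geneA": 0.9, "geneB": 0.4, "geneC": 0.1}
equal_costs = classify_essential(example_posteriors, cost_fd=1.0, cost_fnd=1.0)
costly_misses = classify_essential(example_posteriors, cost_fd=1.0, cost_fnd=3.0)
```

With equal costs the threshold is 0.5; making a missed essential gene three times as costly lowers it to 0.25, so more genes are called essential.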

    Pathotyping the Zoonotic Pathogen Streptococcus suis: Novel Genetic Markers To Differentiate Invasive Disease-Associated Isolates from Non-Disease-Associated Isolates from England and Wales.

    Streptococcus suis is one of the most important zoonotic bacterial pathogens of pigs, causing significant economic losses to the global swine industry. S. suis is also a very successful colonizer of mucosal surfaces, and commensal strains can be found in almost all pig populations worldwide, making detection of the S. suis species in asymptomatic carrier herds of little practical value in predicting the likelihood of future clinical relevance. The value of future molecular tools for surveillance and preventative health management lies in the detection of strains that genetically have increased potential to cause disease in presently healthy animals. Here we describe the use of genome-wide association studies to identify genetic markers associated with the observed clinical phenotypes (i) invasive disease and (ii) asymptomatic carriage on the palatine tonsils of pigs on UK farms. Subsequently, we designed a multiplex PCR targeting three genetic markers that differentiated 115 S. suis isolates into disease-associated and non-disease-associated groups and that performed with a sensitivity of 0.91, a specificity of 0.79, a negative predictive value of 0.91, and a positive predictive value of 0.79 in comparison to observed clinical phenotypes. We describe evaluation of our pathotyping tool, using an out-of-sample collection of 50 previously uncharacterized S. suis isolates, in comparison to existing methods used to characterize and subtype S. suis isolates. In doing so, we show our pathotyping approach to be a competitive method to characterize S. suis isolates recovered from pigs on UK farms and one that can easily be updated to incorporate global strain collections.
This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) Knowledge Transfer Network CASE studentship co-funded by Zoetis (previously Pfizer Animal Health UK) and with significant contribution from BQP Ltd (Award Reference: BB/L502479/1). Funding bodies provided scholarship support but had no part in study design, data collection, analysis and interpretation of data or in writing the manuscript. AWT is supported by a BBSRC Longer and Larger (LoLa) grant (Award Reference: BB/G019274/1). LAW is supported by a Dorothy Hodgkin Fellowship funded by the Royal Society (Grant Number: DH140195) and a Sir Henry Dale Fellowship co-funded by the Royal Society and Wellcome Trust (Grant Number: 109385/Z/15/Z).
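
The four performance figures quoted in this abstract are standard functions of a 2x2 confusion table. The sketch below shows how they are computed; the function name and the counts are hypothetical and are not the study's data, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion counts.

    tp: disease-associated isolates called disease-associated
    fp: non-disease-associated isolates called disease-associated
    tn: non-disease-associated isolates called non-disease-associated
    fn: disease-associated isolates called non-disease-associated
    """
    if min(tp, fp, tn, fn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("each row and column total must be non-zero")
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts chosen only to make the arithmetic easy to follow.
example_metrics = diagnostic_metrics(tp=8, fp=4, tn=6, fn=2)
```

Sensitivity and specificity depend only on the test, whereas the predictive values also depend on the proportion of disease-associated isolates in the collection being tested.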

    Comparative genomics of isolates of a Pseudomonas aeruginosa epidemic strain associated with chronic lung infections of cystic fibrosis patients

    Pseudomonas aeruginosa is the main cause of fatal chronic lung infections among individuals suffering from cystic fibrosis (CF). During the past 15 years, particularly aggressive strains transmitted among CF patients have been identified, initially in Europe and more recently in Canada. The aim of this study was to generate high-quality genome sequences for 7 isolates of the Liverpool epidemic strain (LES) from the United Kingdom and Canada representing different virulence characteristics in order to: (1) associate comparative genomics results with virulence factor variability and (2) identify genomic and/or phenotypic divergence between the two geographical locations. We performed phenotypic characterization of pyoverdine, pyocyanin, motility, biofilm formation, and proteolytic activity. We also assessed the degree of virulence using the Dictyostelium discoideum amoeba model. Comparative genomics analysis revealed at least one large deletion (40-50 kb) in 6 out of the 7 isolates compared to the reference genome of LESB58. These deletions correspond to prophages, which are known to increase the competitiveness of LESB58 in chronic lung infection. We also identified 308 non-synonymous polymorphisms, of which 28 were associated with virulence determinants and 52 with regulatory proteins. At the phenotypic level, isolates showed extensive variability in production of pyocyanin, pyoverdine, proteases and biofilm as well as in swimming motility, while being predominantly avirulent in the amoeba model. Isolates from the two continents were phylogenetically and phenotypically indistinguishable. Most regulatory mutations were isolate-specific and 29% of them were predicted to have high functional impact. Therefore, polymorphism in regulatory genes is likely to be an important basis for phenotypic diversity among LES isolates, which in turn might contribute to this strain's adaptability to varying conditions in the CF lung.

    MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level

    BACKGROUND: The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. DESCRIPTION: Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. CONCLUSION: The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic

    Microcin H47 System: An Escherichia coli Small Genomic Island with Novel Features

    Genomic islands are DNA regions containing variable genetic information related to secondary metabolism. Frequently, they have the ability to excise from and integrate into replicons through site-specific recombination. Thus, they are usually flanked by short direct repeats that act as attachment sites, and contain genes for an integrase and an excisionase, which carry out the genetic exchange. These mobility events are thought to underlie the horizontal transfer of genomic islands among bacteria.

    EDGAR: A software framework for the comparative analysis of prokaryotic genomes

    Blom J, Albaum S, Doppmeier D, et al. EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinformatics. 2009;10(1):154. Background: The introduction of next generation sequencing approaches has caused a rapid increase in the number of completely sequenced genomes. As one result of this development, it is now feasible to analyze large groups of related genomes in a comparative approach. A main task in comparative genomics is the identification of orthologous genes in different genomes and the classification of genes as core genes or singletons. Results: To support these studies, EDGAR ("Efficient Database framework for comparative Genome Analyses using BLAST score Ratios") was developed. EDGAR is designed to automatically perform genome comparisons in a high throughput approach. Comparative analyses for 582 genomes across 75 genus groups taken from the NCBI genomes database were conducted with the software and the results were integrated into an underlying database. To demonstrate a specific application case, we analyzed ten genomes of the bacterial genus Xanthomonas, for which phylogenetic studies were awkward due to divergent taxonomic systems. The resultant phylogeny EDGAR provided was consistent with outcomes from traditional approaches performed recently and moreover, it was possible to root each strain with unprecedented accuracy. Conclusion: EDGAR provides novel analysis features and significantly simplifies the comparative analysis of related genomes. The software supports a quick survey of evolutionary relationships and simplifies the process of obtaining new biological insights into the differential gene content of kindred genomes. Visualization features, like synteny plots or Venn diagrams, are offered to the scientific community through a web-based and therefore platform independent user interface, http://edgar.cebitec.uni-bielefeld.de, where the precomputed data sets can be browsed.
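
The "BLAST score ratio" in EDGAR's name is commonly understood as a hit's bit score normalised by the score of the query aligned against itself. The sketch below illustrates that normalisation only. It is not EDGAR's code: the function names are hypothetical, and the fixed cutoff of 0.3 is a placeholder assumption, since the abstract does not state how the threshold is chosen.

```python
def score_ratio(hit_bit_score, self_bit_score):
    """Normalise a BLAST bit score by the query's self-hit bit score.

    A value near 1.0 means the hit scores almost as well as the query
    aligned to itself; a value near 0.0 means a weak hit.
    """
    if self_bit_score <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_bit_score / self_bit_score


def passes_cutoff(hit_bit_score, self_bit_score, cutoff=0.3):
    """Flag a hit as a candidate ortholog if its score ratio meets the cutoff.

    The default cutoff is a hypothetical placeholder, not a documented value.
    """
    return score_ratio(hit_bit_score, self_bit_score) >= cutoff


# Hypothetical bit scores, for illustration only.
strong_hit = score_ratio(270.0, 300.0)
weak_hit_ok = passes_cutoff(60.0, 300.0)
```

Normalising by the self-hit score makes hits comparable across genes of different lengths, which a raw bit score is not.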

    Data science

    Even though it has only entered public perception relatively recently, the term "data science" already means many things to many people. This chapter explores both top-down and bottom-up views on the field, on the basis of which we define data science as "a unique blend of principles and methods from analytics, engineering, entrepreneurship and communication that aim at generating value from the data itself". The chapter then discusses the disciplines that contribute to this "blend", briefly outlining their contributions and giving pointers for readers interested in exploring their backgrounds further

    Mining Virulence Genes Using Metagenomics

    When a bacterial genome is compared to the metagenome of an environment it inhabits, most genes recruit at high sequence identity. In free-living bacteria (for instance, marine bacteria compared against the ocean metagenome), certain genomic regions are totally absent in recruitment plots and therefore represent genes unique to individual bacterial isolates. We show that these Metagenomic Islands (MIs) are also visible in bacteria living in human hosts when their genomes are compared to sequences from the human microbiome, despite the compartmentalized structure of human-related environments such as the gut. From an applied point of view, MIs of human pathogens (e.g. those identified in enterohaemorrhagic Escherichia coli against the gut metagenome or in pathogenic Neisseria meningitidis against the oral metagenome) include virulence genes that appear to be absent in related strains or species present in the microbiome of healthy individuals. We propose that this strategy (i.e. recruitment analysis of pathogenic bacteria against the metagenome of healthy subjects) can be used to detect pathogenicity regions in species where the genes involved in virulence are poorly characterized. Using this approach, we detect well-known pathogenicity islands and identify new potential virulence genes in several human pathogens.

    Genome Sequencing Shows that European Isolates of Francisella tularensis Subspecies tularensis Are Almost Identical to US Laboratory Strain Schu S4

    BACKGROUND: Francisella tularensis causes tularaemia, a life-threatening zoonosis, and has potential as a biowarfare agent. F. tularensis subsp. tularensis, which causes the most severe form of tularaemia, is usually confined to North America. However, a handful of isolates from this subspecies was obtained in the 1980s from ticks and mites from Slovakia and Austria. Our aim was to uncover the origins of these enigmatic European isolates. METHODOLOGY/PRINCIPAL FINDINGS: We determined the complete genome sequence of FSC198, a European isolate of F. tularensis subsp. tularensis, by whole-genome shotgun sequencing and compared it to that of the North American laboratory strain Schu S4. Apparent differences between the two genomes were resolved by re-sequencing discrepant loci in both strains. We found that the genome of FSC198 is almost identical to that of Schu S4, with only eight SNPs and three VNTR differences between the two sequences. Sequencing of these loci in two other European isolates of F. tularensis subsp. tularensis confirmed that all three European isolates are also closely related to, but distinct from Schu S4. CONCLUSIONS/SIGNIFICANCE: The data presented here suggest that the Schu S4 laboratory strain is the most likely source of the European isolates of F. tularensis subsp. tularensis and indicate that anthropogenic activities, such as movement of strains or animal vectors, account for the presence of these isolates in Europe. Given the highly pathogenic nature of this subspecies, the possibility that it has become established wild in the heartland of Europe carries significant public health implications